Image registration is the process of transforming two or more images (or volumes) so that they are in alignment with one another. This is done to perform quantitative comparisons between the images. Image registration can be applied in a number of contexts. At the Mouse Imaging Centre (MICe) we use registration to study mesoscale differences in neuroanatomy using MRI scans. This is accomplished using a technique called deformation-based morphometry (DBM). With DBM, we can compare morphological changes between different objects, e.g. mouse brains, by transforming or deforming one object to match another. This transformation from the first object, \(\mathbf{A}\) to the second, \(\mathbf{B}\), contains information about the ways in which \(\mathbf{A}\) had to change to match \(\mathbf{B}\). Using these transformations, we can extract measures of comparison, like volumetric differences. The goal of image registration is to come up with as good a transformation as possible so that the difference between the final images is minimal.
Image registration is built upon a rigorous mathematical foundation. However, this formalism makes assumptions that don’t hold in real life. Since it is impossible to generate a perfect registration, the goal in practice is to create a pipeline that is as robust as possible, in order to ensure success and minimize errors.
Image registration has three primary goals. Ordered from most general (i.e. not specific to mouse brain images or even brain images) to most specialized (i.e. applied to mouse brains here at MICe), they are:
1. Establish point-by-point correspondence between images.
In other words, we want to be able to map every unique point (i.e. pixel or voxel) in one image to a point in another image. The core goal of registration is to find correspondence between image features.
The white point in the central image is associated with corresponding points in the other four images
2. Create a high-quality consensus average.
A consensus average is a representative average image generated from all images being compared. When working with multiple images, we bring homologous points in all images together towards an average. Building a consensus average allows the idiosyncrasies of individual brains to be smoothed out. This average is used as the reference for image comparison. The consensus average is also preferable to individual MR images for visualizing quantitative maps over the brain.
A coronal slice of a consensus average
Though it is advantageous, it isn’t necessary to build a consensus average. Another strategy for image comparison would be to select one of the individual MR scans as the reference and compare all other images to that image. If it was possible to perfectly identify point-by-point correspondence between all images, then we could use any of the images as a reference. This isn’t the case in practice. Using a particular image as a reference will cause the idiosyncrasies of that image to propagate to all other images in the transformations. For example, if a reference image has an extra structure in the brain, this error will be propagated. Using a consensus average as the reference image allows us to smooth out these particularities. Another way to compare images would be to same all sets of pairwise transformations between images. However this is not very efficient.
3. Perform deformation-based morphometry.
Once point-by-point correspondence is established, we can extract measures of deformation indicating how each of the individual images is different from the consensus average. This information can be used to compute volumetric information.
Following a successful registration, all three of these goals will have been accomplished. However, there is no real standard to assess when a registration is complete. It is always a good idea to examine the consensus average generated from the registration pipeline. If the average doesn’t look good, then you likely don’t have point-by-point correspondence. Point-by-point correspondence is usually inferred from the quality of the average rather than proven. Another way to perform a quality check is to look at the transformed individual images.
The following sections will go into the details of how we perform image registration and deformation-based morphometry. We begin by considering the case of pairwise registration. Multi-image registration will be discussed in later.
As mentioned above, the goal of image registration is to align images together. With pairwise image registration we can accomplish this by transforming one image, which we call the source, to be in alignment with the second image, which we call the target.
How can we generate a good registration between these two images? The key is not to perform the entire registration at once, but rather to break the process down into pieces. Specifically, we begin by performing a rapid coarse alignment between images. The alignment is then refined using a series of more complicated transformations. At MICe, pairwise registration is performed using three steps:
We discuss each of these in turn.
In order to align two images together, the first, easiest thing to do is to make sure that the two images are roughly in the same orientation and at the same position in space. This is accomplished by rotating and translating the image. Collectively, these comprise a rigid transformation, i.e. a linear transformation that preserves volume elements. At MICe we obtain a rigid alignment between images using least squares optimization with 6 degrees of freedom (LSQ6). This algorithm finds the optimal rigid transformation with 6 degrees of freedom that aligns the source with the target. The six degrees of freedom in the transformation correspond to three rotations (about \(x\), \(y\), and \(z\)) and three translations (along \(x\), \(y\), and \(z\)). Once the optimal transformation has been identified, it is applied to the source image to bring it into alignment with the target. Conceptually, this just means that the source image is rotated and translated until it is in optimal alignment with the target image:
As mentioned previously, this step generates a rough alignment between the target and source. This can be seen if we try to build a consensus average using the transformed source image and the target. The overlayed image is blurry and the structures aren’t nicely resolved.
This is a good first step, but to obtain better point-by-point correspondence and build a nicer average, the alignment needs to be refined.
Following coarse alignment using a rigid transformation, the source and target images are registered together more finely using an affine transformation. As with rigid transformations, affine transformations are linear transformations that affect every voxel in the same way. However, affine transformations do not preserve volume elements. This is because affine transformations allow for scaling and shearing, in addition to rotation and translation:
Affine transformations do however preserve parallelism, meaning that lines that are parallel prior to the transformation will remain parallel afterwards. The optimal affine transformation is identified using least squares optimization with 12 degrees of freedom (LSQ12). These twelve degrees of freedom include the three rotations and translations mentioned previously, but also scaling along each axis and shearing across each pair of axes. Alignment using LSQ12 is performed using the LSQ6-resampled, i.e. LSQ6-transformed, image as the source. This means that the bulk of the alignment has already been performed. The rotations and translations performed here will serve to refine the coarse alignment that was obtained using LSQ6. The source image will also be scaled to the appropriate size, and any necessary shears will be applied. Once the LSQ6-transformed source image is transformed using the optimal LSQ12 transformation, the source and target will actually be really well aligned.
In fact it is possible to obtain excellent point-by-point correspondence simply by using affine transformations. In mice, a lot of brain differences come down to differences in overall brain size, which are resolved following affine transformations. However, affine transformations are limited in their ability to generate a highly precise alignment between images. This is because they influence all voxels in the same way. Whatever changes are applied to one part of the image must also be applied to all other parts. One can easily imagine a scenario in which aligning a given region of the brain, e.g. the cerebellum, with the target would take another region of the brain, e.g. the olfactory bulbs, out of alignment. In order to achieve perfect alignment between images, we need to allow different parts of the image to be transformed in different ways. This requires the use of non-linear transformations.
While rigid and affine transformations have embedded constraints regarding how voxels in the image can move, non-linear transformations are extremely flexible. In fact, non-linear transformations allow every voxel to be transformed independently of the others. Though this flexibility is necessary to generate a precise alignment between images, it can also be problematic when applied naively. Specifically, the leniency of non-linear transformations makes it difficult to find an optimal solution to the registration problem. Moreover, the solutions obtained are not guaranteed to be physical. For instance, if every voxel in the image is allowed to move independently of the others, it is possible to obtain excellent point-by-point correspondence by transforming both the source and target to a greyscale gradient.
This technically solves the registration problem with high accuracy but it obviously isn’t what we want. Though the images are registered together, all physical information is lost. In order to generate a solution that is both accurate and physical, we need to constrain the way in which voxels can be transformed. Part of this process has already been accomplished with the rigid and affine transformations discussed above. With the images in near-perfect alignment, it is possible to apply a constrained non-linear transformation to smooth out the remaining details. In particular, three constraints are considered:
While alignment using non-linear registration can be performed on the images themselves, it can also be performed on the gradient map of the images, i.e. the edges. To generate a better alignment, the intensity images and the gradient images can be registered simultaneously. Different weights can be applied to each class of image in the optimization process.
Intensity and gradient (red) images
Following non-linear registration the source and target images will be in near-perfect alignment. This completes the registration process. In doing so, we have established point-by-point correspondence between images, and can create a high-quality consensus average. The remaining step is to use the optimal LSQ6, LSQ12, and non-linear transformations to extract volumetric information in order to make quantitative comparisons between the source and target.
In order to understand how deformation-based morphometry is used to extract volumetric measures from the transformations described above, it is important to review how MR images are stored and described. Since MR images are acquired in three dimensions, we can store them in a 3-dimensional array, i.e. a data cube:
Each element of this array represents a sampled voxel, and the value stored in a given element is the intensity value of the voxel. Each dimension along the array represents a dimension in real space, i.e. \(x\), \(y\), or \(z\). Any element of the array can be accessed using a set of three indices, e.g. myArray[i, j, k]. The set of indices \([i, j, k]\) represents the voxel coordinates of a given voxel. Since the MR image is a digital representation of an object that exists in the real world, each voxel is associated with a position in real space. Thus each voxel in the array has an associated set of world coordinates, \([x, y, z]\).
So how does this relate to image registration and deformation-based morphometry? Previously we discussed how various transformations can be applied to align our images. What this means in practice is that the transformations are applied to every voxel in an image. In particular, the transformations are represented as mathematical rules that act on the world coordinates of individual voxels. Generally, a transformation can be described as a multivariate function that takes a voxel’s initial coordinates, \(\vec{x} = [x, y, z]\), and returns its new coordinates, \(\vec{X} = [X, Y, Z]\). This is exactly described by a vector field over the image:
\[\vec{X}(\vec{x}) = \left[X(\vec{x}), Y(\vec{x}), Z(\vec{x}) \right] = \left[X(x, y, z), Y(x, y, z), Z(x, y, z) \right] \quad .\]
The components of the vector field, i.e. \(X(\vec{x})\), \(Y(\vec{x})\), and \(Z(\vec{x})\), describe the voxel’s new position in each dimension. Each of the transformations discussed above can be described in this way. However, the specific form of the mapping \(\vec{X}(\vec{x})\) will depend on the type of transformation under consideration. We’ll begin by considering the general case of non-linear transformations before moving on to the specific cases of rigid and affine transformations, which are instances of linear transformations.
Non-linear transformations
In non-linear registration, we allow all voxels to move independently of the others, subject to some regularizing constraints. As mentioned above, each voxel begins at some set of world coordinates \(\vec{x} = [x, y, z]\) and moves to a new location \(\vec{X} = [X, Y, Z]\). This is described by the vector field \(\vec{X}(\vec{x})\). The “non-linear” designation of non-linear transformations means that the coordinate functions of the vector field, \(X(\vec{x})\), \(Y(\vec{x})\) and \(Z(\vec{x})\), can take on any functional form. The functions aren’t restricted to be linear functions of \(\vec{x}\). We do apply regularizing constraints, as discussed previously, but these constraints don’t impose linearity. Here is an example of a non-linear vector field in two dimensions, generated using \(\vec{X}(\vec{x}) = [X(\vec{x}), Y(\vec{x})] = [\sin(y), \sin(x)]\):
The non-linear transformation is completely characterized by the vector field. Given the initial position of any voxel in the image, we can indicate where it will be following the transformation. This function contains a lot of information however. In essence we have three maps over the image, one for each component of the vector field:
Each map describes the transformation along one axis. As you might appreciate, this doesn’t provide a simple and intuitive way to understand how the source image was deformed to match the target image. It would be ideal to instead have a single map that describes how every voxel in the source had to be deformed to meet the target. In fact, we can build such a map by looking at the Jacobian determinants of the transformation. To compute the Jacobian determinants, we first have to compute the Jacobian matrix associated with the transformation \(\vec{X}(\vec{x})\) (named after Carl Gustav Jacob Jacobi, a German mathematician from the 19th century). The Jacobian matrix is built by taking the partial derivatives of the components of the vector field \(\vec{X}(\vec{x})\) along each dimension:
\[\mathbf{J}(\vec{x}) = \begin{bmatrix} \frac{\partial X(\vec{x})}{\partial x} & \frac{\partial X(\vec{x})}{\partial y} & \frac{\partial X(\vec{x})}{\partial z} \\ \frac{\partial Y(\vec{x})}{\partial x} & \frac{\partial Y(\vec{x})}{\partial y} & \frac{\partial Y(\vec{x})}{\partial z} \\ \frac{\partial Z(\vec{x})}{\partial x} & \frac{\partial Z(\vec{x})}{\partial y} & \frac{\partial Z(\vec{x})}{\partial z} \end{bmatrix} \quad . \]
The Jacobian effectively plays the role of a first-order derivative for the transformation. As such, it describes the behaviour of the transformation for a very very small (infinitesimal) region of space around \(\vec{x}\). Since the Jacobian matrix is defined at every world coordinate, it forms a tensor field over the image.
In building this object, we seem to have made things more complicated: Instead of having three individual maps over the image, we now have nine. However, since the Jacobian is a square matrix, we can compute its determinant, i.e. the Jacobian determinant. The determinant of a matrix of numbers is a single value that encodes certain properties of the transformation represented by the matrix. Since the Jacobian matrix is tensor field, the determinant of the matrix is also a function. In particular, it forms a scalar field, \(\det[\mathbf{J}(\vec{x})] \equiv g(\vec{x}) = g(x, y, z)\), which returns a single value at every world coordinate. This is what we want. The Jacobian determinant field provides a single map that describes how every voxel in the source was deformed to align with the target.
The value of the determinant field at a given point can be interpreted as a scaling factor applied to a volume element or voxel. So a value of 1 means that the volume of the voxel is unchanged, a value greater than 1 indicates expansion, and a value between 0 and 1 indicates contraction. In theory, the value of the determinant at any given voxel can be anywhere between \(-\infty\) and \(\infty\). A negative determinant would indicate that a volume element, i.e. voxel, has been inverted. In practice, this does not happen with the image registrations that we perform; the determinant is always positive. If the image registration process generates some negative values for the determinant, it typically indicates that something went wrong.
Note that at MICe, we typically work with the log-transformed determinants, \(\ln[g(\vec{x})]\). This symmetrizes the range of values for contraction and expansion, such that negative values indicate contraction, positive values indicate expansion, and a value of 0 indicates no change.
Aside: Khan Academy offers some useful short lectures for understanding and interpreting the Jacobian matrix and the Jacobian determinant.
Rigid and affine transformations
Rigid (LSQ6) and affine (LSQ12) transformations are linear transformations. Linear transformations are also fundamentally described by a vector field \(\vec{X}(\vec{x})\), however they can be simplified in a neat fashion. The “linear” denomination of linear transformations means that the coordinate functions of the vector field are linear functions of the world coordinates, i.e.
\[ \begin{aligned} X(x, y, z) &= j_{xx}x + j_{xy}y + j_{xz}z + b_x \\ Y(x, y, z) &= j_{yx}x + j_{yy}y + j_{yz}z + b_y \\ Z(x, y, z) &= j_{zx}x + j_{zy}y + j_{zz}z + b_z \quad . \end{aligned} \]
Here the values \(b_i\) represent constant terms that play the role of an intercept. The values \(j_{ij}\) are constant terms that play the role of a slope for each variable. Since the vector field is constrained to take this specific form, we can actually express it more elegantly using matrix algebra:
\[\vec{X}(\vec{x}) = \mathbf{J} \cdot \vec{x} + \vec{b} \]
Or explicitly:
\[\begin{bmatrix} X(x, y, z) \\ Y(x, y, z) \\ Z(x, y, z) \end{bmatrix} = \begin{bmatrix} j_{xx} & j_{xy} & j_{xz} \\ j_{yx} & j_{yy} & j_{yz} \\ j_{zx} & j_{zy} & j_{zz} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} b_x \\ b_y \\ b_z \end{bmatrix} \]
The vector \(\vec{b}\) and matrix \(\mathbf{J}\) are populated by numbers and so have no functional dependence on \(\vec{x}\). All the functional dependence arises in the linear relationship with \(\vec{x}\). Notice that there are twelve values that we haven’t specified in \(\vec{b}\) and \(\mathbf{J}\). These are exactly the twelve degrees of freedom that are optimized in LSQ12 registration. In fact, the intercept vector \(\vec{b}\) represents the translation of the image, whereas the matrix \(\mathbf{J}\) contains all the information about rotations, shears and scales. The matrix equation above defines an affine transformation. Another nice thing about linear transformations is that the Jacobian matrix is actually just the matrix of numbers, \(\mathbf{J}\). This can be verified easily by computing the partial derivatives. Since this matrix has no functional dependence on \(\vec{x}\), the values of the matrix are the same across the entire image. This technically still forms a tensor field, but a tensor field with a constant value across space. Of course we can also compute the determinant of the Jacobian matrix. Since the matrix is constant, the Jacobian determinant will also be constant. However, the interpretation of the value of the determinant is still the same as described for non-linear transformations. The difference here is that the same scaling factor will be applied to all volumes across the entire image, rather than allowing for local differences in deformation.
What about rigid transformations? A rigid transformation is actually just an affine transformation with \(\det[\mathbf{J}] = 1\). This means that a rigid transformation is built to preserve volume elements and essentially means that the matrix \(\mathbf{J}\) only describes rotations. With the addition of \(\vec{b}\), translations are also possible.
Summary
We’ve covered a lot of ground so far, so it might be helpful to have a quick summary about how all of this is tied together.
In the case of pairwise image registration, we aim to transform, i.e. deform, the source image so that it is aligned with the target image (or vice versa). Rather than doing this all at once, we break the process down into three transformations: 1. A rigid transformation with LSQ6, 2. an affine transformation with LSQ12, and 3. a regularized non-linear transformation. Rigid transformations are special linear transformations that only allow for translation and rotation of the source image. Affine transformations are general linear transformations that allow for scaling and shearing of the source image, in addition to rotation and translation. Finally, non-linear transformations are general coordinate transformations that allow voxels to move independently of one another. Following these three transformations, the source and target images will be in near-perfect alignment.
In order to make quantitative comparisons between the images, we have to consider the mathematics of the transformations used to align the images. The quantity that we are interested in is the Jacobian determinant scalar field associated with the transformation (note that this is a choice and not the only quantity that can be used to describe morphological change). The Jacobian determinant describes how volume elements contract or expand following a transformation. This field is computed from the Jacobian matrix of the transformation.
Rigid (LSQ6) and affine (LSQ12) transformations are linear transformations, so the Jacobian matrix is simply the rotation/shear/scaling matrix of the transformation. Since this is a constant matrix, i.e. it has no spatial dependence, the Jacobian determinant is also constant. This single value describes how all voxels change following the linear transformation. Keep in mind that rigid transformations have a determinant of 1 by definition and so don’t induce volume changes.
In the more complicated and general case of non-linear transformations, the Jacobian matrix is computed from the partial derivatives of the vector field describing the transformation. In general the Jacobian matrix forms a tensor field that varies over the image, and so the Jacobian determinant will be a scalar field that varies over the image. Rather being a single value that describes how all volumes changes across the image, the Jacobian determinant field describes how individual volume elements change at different positions in the image.
The determinant field calculated from the transformations allows us to analyze how volumes in the source image compare to volumes in the target image. In this way, we can examine changes in brain morphology.
Our exploration of image registration has so far been focused on the problem of aligning and comparing two images. In general, we would like to be able to compare an arbitrary number of brain images in order to perform statistical analyses across groups. The framework for image registration and deformation-based morphometry that was laid out in the previous sections can be applied to the problem of multi-image registration. However, the pipeline must be modified to account for the greater number of images.
One of the central considerations of multi-image registration is the choice of reference space. The reference space is the space of the image to which all other images are being compared. In the special case of pairwise image registration, the choice of reference space is trivial. It can easily be chosen to be either the space of the source image or the target. In multi-image registration, the choice of reference space has a greater impact on the outcome of the registration. As discussed in the Introduction, any given image in the set of images can be chosen as the reference. However this isn’t an optimal choice since 1. all morphometric information will be in reference to this image, and 2. any idiosyncrasies of the image will propagate through the registration and influence the outcome. To get around these problems, we desire a reference space that isn’t highly sensitive to any given image. This is where the consensus average starts to play an important role. The consensus average is built from the transformations of all the individual images in the data set. In a way, it represents a sort of average image of the brains being compared. The consensus average and its associated image space provide a useful reference space in the context of multi-image registration. Since the consensus average is built by aggregating the transformations of the individual images, specific idiosyncrasies tend to get smoothed out (though this may not be so in the case of large structural abnormalities). Moreover, all individual brain images can be transformed to and from the consensus average space, which allows us to make morphometric comparisons in reference to this average.
In the following Sections, we will discuss the process of aligning multiple brain images together in order to generate a high-quality consensus average and associated space, as well as to use deformation-based morphometry across brains.
When the mouse brain images are reconstructed from \(k\)-space following acquisition, they will all be in slightly different orientations. As discussed in the Section on Pairwise image registration, the purpose of the rigid LSQ6 transformation is to perform a coarse alignment of the images. In the case of multi-image registration, it isn’t obvious what the target of this initial alignment should be. The consensus average space has not yet been constructed and so obviously cannot be used as the target. In practice, we use what is called an initial model (colloquially called “init model”). The initial model is a target brain image that serves to define a space towards which to align the individual images.
The initial model serves an additional purpose: It is used to define a transformation between the coordinate system in which the images are acquired, and the coordinate system in which we prefer to perform our analyses. To elaborate, when the mouse brains are scanned in the MRI scanner, the images are acquired in a coordinate system that is defined for the scanner being used. This coordinate system is known as scanner space or native space (native meaning that it is the space in which the images were “born”). However, we prefer to perform our analyses in a coordinate system that is rotated by 180 degrees about the \(z\)-axis. This is simply a choice of convention that corresponds to the orientation in which human scans are acquired. This coordinate system is known as standard space; we can also call it MICe space.
Standard space is related to native space by a 180 degree rotation about the \(z\)-axis
In both native and standard space the correspondence between the coordinate planes and the anatomical planes is as follows:
The difference is that in standard space, the y-axis points in the rostral direction and the x-axis points laterally towards the right.
Transverse section of a 40 micron ex-vivo initial model in both native and standard space. Note that a 180 degree rotation of the coordinate system is equivalent to a 180 degree rotation of the brain itself.
The initial model used to perform LSQ6 registration depends on the type of scan that was performed to acquire the images. In fact there exist a number of initial models for the different scanning protocols that are implemented at MICe. The initial model used will depend on which scanner is used, whether the scan is in- or ex-vivo, the resolution of the scan, and other factors. Every time a new scanning protocol is implemented, a new initial model must be created for that protocol. The process of creating a new initial model is described in the Appendix. The existing initial models can be found at:
/hpf/largeprojects/MICe/tools/initial-models/
To recapitulate: In LSQ6 registration, all individual brain images are rigidly aligned to an appropriate initial model. This alignment is actually performed in native space. Once the images are registered to the initial model in native space, the tranformation from native to standard space is applied to rotate the individual images into the proper orientation. From there, the LSQ6 consensus average is generated by taking a voxel-by-voxel average across the registered images.
Following LSQ6 registration to an initial model, the images are aligned together using LSQ12 registration. Though a consensus average is built from the LSQ6-aligned images, this average actually isn’t used as the target for LSQ12 registration. Rather a new LSQ12 consensus average is generated using an average of all pairwise affine transformations. The way that LSQ12 registration is implemented in the multi-image context is as follows. First, all pairs of images are registered together using LSQ12 as described in the section on Pairwise image registration. This generates a set of optimal transformations between all images.
Representation of all pairwise transformations between three images A, B and C
The transformations from any given image to all other images are then averaged together to generate an average transformation for that image. Consider the image \(A\), the world coordinates of which can be represented by a vector, \(\vec{x}_A\). The optimal affine transformations from \(A\) to \(B\) and \(A\) to \(C\) are maps \(T_{AB}\) and \(T_{AC}\) that take \(\vec{x}_A\) to new coordinates \(\vec{x}_{AB}\) and \(\vec{x}_{AC}\). As discussed above, these are linear transformations so we can write out their functional form:
\[ \begin{aligned} \vec{x}_{AB} &= T_{AB}\left(\vec{x}_A\right) = \mathbf{J}_{AB} \cdot \vec{x}_A + \vec{b}_{AB} \\ \vec{x}_{AC} &= T_{AC}\left(\vec{x}_A\right) = \mathbf{J}_{AC} \cdot \vec{x}_A + \vec{b}_{AC} \quad . \end{aligned} \] We can define an average transformed vector as \(\langle\vec{x}\rangle_A = \frac{1}{2}\left(\vec{x}_{AB} + \vec{x}_{AC}\right)\). This allows us to build an average transformation as follows:
\[ \begin{aligned} \langle\vec{x}\rangle_A &= \frac{1}{2}\left[\vec{x}_{AB} + \vec{x}_{AC}\right] \\ &= \frac{1}{2}\left[T_{AB}\left(\vec{x}_A\right) + T_{AC}\left(\vec{x}_A\right)\right] \\ &= \frac{1}{2}\left[\mathbf{J}_{AB} \cdot \vec{x}_A + \vec{b}_{AB} + \mathbf{J}_{AC} \cdot \vec{x}_A + \vec{b}_{AC} \right] \\ &= \frac{1}{2}\left[\left(\mathbf{J}_{AB} + \mathbf{J}_{AC}\right) \cdot \vec{x}_A + \vec{b}_{AB} + \vec{b}_{AC} \right] \\ &= \frac{1}{2} \left(\mathbf{J}_{AB} + \mathbf{J}_{AC}\right) \cdot \vec{x}_A + \frac{1}{2} \left(\vec{b}_{AB} + \vec{b}_{AC}\right) \\ &= \langle\mathbf{J}\rangle_A\cdot \vec{x}_A + \langle\vec{b}\rangle_A \\ &\equiv \langle T\rangle_A \left(\vec{x}_A\right) \quad . \end{aligned} \] We can think of this average transformation as deforming image \(A\) towards a central image space. The average transformations built for each of the other images in the set will also deform them towards this central space.
Once all individual images are transformed into this central space using these average transformations, the LSQ12 consensus average is generated.
The final step in the pipeline for multi-image registration is to perfect the alignment of all images to the consensus average space using non-linear registration. Here the LSQ12 consensus average is used as the target for the registration. This step in pipeline is performed iteratively in order to refine the alignment of all images to the consensus average space, and to generated a consensus average of higher quality. All images are first aligned to the LSQ12 consensus average using regularized non-linear transformations. These images are then averaged to the generate the first non-linear consensus average, called “nlin1”. The individual images (which are now in LSQ12 consensus average space) are then aligned non-linearly to the “nlin1” consensus average space. A new consensus average is constructed, called “nlin2”. Finally, the process is repeated once more to generated a third and final consensus average, “nlin3”. At the end of this process, all images will be in alignment with the “nlin3” consensus average space. Additionally we will be in possession of a consensus average of very high quality.
As with pairwise image registration, the quantitative map of interest following multi-image registration is the Jacobian determinant field. However a key difference in this case is that the Jacobians describe the volume differences between the individual brains and the consensus average, rather than between any given pair of brains. Thus voxel volumes that are smaller compared to the consensus average will have positive log-determinants, indicating expansion, while volumes that are larger will have negative log-determinants, indicating contraction. Typically, the Jacobian determinants are inverted to indicate how much the consensus average must be expanded or contracted to match an individual brain image.